Natural Language Processing of Mathematical Texts in mArachna
Abstract
mArachna is a technical framework for extracting mathematical knowledge from natural language texts. It avoids the problems typically encountered in approaches based on automated reasoning by applying natural language processing techniques that exploit the strictly formalized language characteristic of mathematical texts. Mathematical texts have a strict internal structure and can be separated into text elements (entities) such as definitions and theorems. These entities are the principal carriers of mathematical information. In addition, entities show a characteristic coupling between the information they present and their internal linguistic structure, which makes them well suited to natural language processing. Taking advantage of this structure, mArachna extracts mathematical relations from texts and integrates them into a knowledge base. The structure of the knowledge base is defined by identifying sub-elements of newly extracted information with mathematical concepts that are already stored. As a result, mArachna generates an ontology of the analyzed mathematical texts. In response to user queries, parts of the knowledge base are visualized using OWL. In particular, mArachna aims to provide an overview of single fields of mathematics and to show intra-field relations between mathematical objects and concepts. This paper gives an overview of the theoretical basis and the technologies applied within the mArachna framework.
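The pipeline described in the abstract (split a text into entities, extract relations from them, and link new terms to concepts already stored) can be illustrated with a toy sketch. This is not mArachna's implementation: a single hand-written regular expression stands in for its linguistic analysis, a plain dictionary stands in for its knowledge base, and all names (`split_entities`, `extract_relations`, `build_knowledge_base`, `ancestors`) and the `Definition:`/`Theorem:` label format are assumptions made for illustration.

```python
import re

# Assumed input format: one labelled entity per line, e.g. "Definition: ...".
ENTITY_LABEL = re.compile(r"^(Definition|Theorem|Lemma)\b[^:]*:\s*(.*)$")

# Assumed sentence shape: "A <term> is a <parent> [with|in which|such that|whose ...]."
IS_A = re.compile(
    r"^An? (?P<term>[a-z ]+?) is an? (?P<parent>[a-z ]+?)"
    r"(?: (?:with|in which|such that|whose)\b.*)?\.?$"
)


def split_entities(text):
    """Yield (kind, body) for every labelled line in the text."""
    for line in text.splitlines():
        match = ENTITY_LABEL.match(line.strip())
        if match:
            yield match.group(1), match.group(2)


def extract_relations(body):
    """Return (term, 'is_a', parent) triples found in one definition body."""
    match = IS_A.match(body)
    if match:
        return [(match.group("term"), "is_a", match.group("parent"))]
    return []


def build_knowledge_base(text):
    """Map each concept to the set of (relation, target) pairs extracted for it."""
    kb = {}
    for kind, body in split_entities(text):
        if kind != "Definition":
            continue  # this sketch only reads relations from definitions
        for term, relation, parent in extract_relations(body):
            kb.setdefault(term, set()).add((relation, parent))
            kb.setdefault(parent, set())  # register the parent as a known concept
    return kb


def ancestors(kb, term):
    """Follow 'is_a' links transitively, returning concepts in discovery order."""
    found = []
    stack = [term]
    while stack:
        current = stack.pop()
        for relation, parent in sorted(kb.get(current, ())):
            if relation == "is_a" and parent not in found:
                found.append(parent)
                stack.append(parent)
    return found


SAMPLE = (
    "Definition: A group is a monoid in which every element has an inverse.\n"
    "Definition: An abelian group is a group whose operation is commutative.\n"
    "Theorem: Every subgroup of a cyclic group is cyclic.\n"
)

if __name__ == "__main__":
    kb = build_knowledge_base(SAMPLE)
    print(ancestors(kb, "abelian group"))
```

In this sketch the second definition is attached to the existing `group` node, so querying `abelian group` also reaches `monoid`. That mirrors, in a very reduced form, how identifying parts of new information with stored concepts gives the knowledge base its structure.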
Similar resources
mArachna: A Classification Scheme for Semantic Retrieval in eLearning Environments in Mathematics
Automated extraction of information from natural language texts remains a largely unsolved problem. Scientific texts in general, and mathematical texts in particular, are characterised by the use of complex language constructs, often requiring extensive background knowledge for comprehension. Fortunately, many mathematical texts contain special types of text elements, such as definitions and the...
On the EA-style Integrated Processing of Self-contained Mathematical Texts
In this paper, we continue to develop our approach to theorem proof search in the EA-style, that is, theorem proving in the framework of integrated processing of mathematical texts written in a first-order formal language close to the natural language used in mathematical papers. This framework enables constructing a sound and complete goal-oriented sequent-type calculus with "large-block" inferen...
The DeLiVerMATH Project - Text Analysis in Mathematics
A high-quality content analysis is essential for retrieval functionalities but the manual extraction of key phrases and classification is expensive. Natural language processing provides a framework to automatize the process. Here, a machine-based approach for the content analysis of mathematical texts is described. A prototype for key phrase extraction and classification of mathematical texts i...
The Naproche Project: Controlled Natural Language Proof Checking of Mathematical Texts
The Naproche project (NAtural language PROof CHEcking) studies the semiformal language of mathematics (SFLM) as used in journals and textbooks from the perspectives of linguistics, logic and mathematics. A central goal of Naproche is to develop and implement a controlled natural language (CNL) for mathematical texts which can be transformed automatically into equivalent first order formulas by ...
Publication date: 2008